Keir Fraser [Thu, 26 Nov 2009 10:56:49 +0000 (10:56 +0000)]
xend: little fix for tap
Need to get the dev type after creating the tap device, as device_create does.
Signed-off-by: Wei Kong <weikong.cn@gmail.com>
Keir Fraser [Wed, 25 Nov 2009 14:19:50 +0000 (14:19 +0000)]
libxenlight: move logging macros to the public header
This patch moves the logging macros to the public header so that they
can be reused by the client of the library. It also refactors the
code to create the qemu logfile into a generic function that can be
reused to create generic xen logfiles under /var/log/xen. Finally xl
is changed to log to file when running in background.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Wed, 25 Nov 2009 14:19:20 +0000 (14:19 +0000)]
libxenlight: clean up the domain when it dies
This patch adds two functions to libxenlight to be able to recognize
when a particular domain dies. After creating a domain, xl uses these
functions to wait for its death and clean up its resources.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Wed, 25 Nov 2009 14:15:57 +0000 (14:15 +0000)]
x86 time: Fix build and clean up.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Wed, 25 Nov 2009 14:12:58 +0000 (14:12 +0000)]
x86 hpet: Do nothing in hpet_broadcast_exit() if no timer deadline.
From: "Jiang, Yunhong" <yunhong.jiang@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Wed, 25 Nov 2009 14:11:37 +0000 (14:11 +0000)]
libxenlight: implement stubdom support
This patch implements stubdom support for libxenlight:
- it adds two functions to find the stubdom domid of a domain and to
figure out if a certain domain is actually a stubdom;
- it moves all the device init functions from xl.c to libxl.c because
they are needed to set up the devices of stubdoms;
- it fixes some bugs in the pci setup that prevented pci passthrough
from working correctly with stubdoms.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Wed, 25 Nov 2009 14:11:02 +0000 (14:11 +0000)]
xm: Add maxvcpus support
This patch adds maxvcpus support to the Xen xm command. It uses the
vcpu_avail bitmask and sets the number of vcpus to maxvcpus if
present. If it's not present, the old behavior is preserved.
In the domain config file you can define it as follows:
maxvcpus = 4
vcpus = 2
This automatically sets vcpus to 4 and sets the corresponding bitmask to
present 2 vcpus to the guest, with the option to increase the count up to 4
vcpus. If maxvcpus is not present, the old behavior for vcpus is
preserved, i.e. you can set vcpus to the number of vcpus to be used
and vcpu_avail is set appropriately to use all of them. Only when
you use both maxvcpus and vcpus is a new vcpu_avail value calculated to
show the PV guest only the desired number of vcpus.
It's been tested using a RHEL-5 32-bit PV guest with maxvcpus = 4 and
vcpus = 2, and also with the previous setup of vcpus = 2 only. In both
cases I was able to use 'xm vcpu-set {domainId} {numberOfVCPUs}' to
change the vcpu count anywhere from 0 to maxvcpus/vcpus, so it was
working as designed.
Signed-off-by: Michal Novotny <minovotn@redhat.com>
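The relationship between maxvcpus, vcpus and vcpu_avail described in the entry above can be sketched as follows. This is an illustration only, not xend's code: the function name compute_vcpu_avail and its return shape are assumptions made for the example.

```python
def compute_vcpu_avail(vcpus, maxvcpus=None):
    """Return (vcpu_count, vcpu_avail_bitmask) for a guest.

    Hypothetical helper that mirrors the behaviour described in the
    commit message; it is not the actual xend implementation.
    """
    if maxvcpus is None:
        # Old behaviour: every configured vcpu is available to the guest.
        return vcpus, (1 << vcpus) - 1
    if vcpus > maxvcpus:
        raise ValueError("vcpus must not exceed maxvcpus")
    # New behaviour: allocate maxvcpus, but mark only the first `vcpus`
    # of them as available in the bitmask.
    return maxvcpus, (1 << vcpus) - 1
```

With maxvcpus = 4 and vcpus = 2, this yields 4 vcpus and a bitmask of 0b11, which matches the example config in the entry.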
Keir Fraser [Wed, 25 Nov 2009 14:06:17 +0000 (14:06 +0000)]
cpuidle: Add decaying history logic to menu idle predictor
This patch is ported from Linux upstream git commit
816bb611e41be29b476dc16f6297eb551bf4d747
The original description is:
"
Add decaying history of predicted idle time, instead of using the last
early wakeup. This logic helps menu governor do better job of
predicting idle time.
With this change, we also measured noticable (~8%) power savings on a
DP server system with CPUs supporting deep C states, when system was
lightly loaded. There was no change to power or perf on other load
conditions.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
"
In the Xen environment, we also observe that this patch reduces idle power
fluctuation. On one DP server, when the system is purely idle, the watts
stdev/average ratio drops from 6% to 2%, which helps idle power
measurement accuracy. There is no performance or power change when the
system is loaded.
Signed-off-by: Yu Ke <ke.yu@intel.com>
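The idea of a decaying history, as opposed to reacting only to the last early wakeup, can be sketched with a simple geometrically decaying average. This is a generic illustration, not the governor's actual formula: the function name and the decay factor of 8 are assumptions.

```python
def decayed_prediction(previous, measured, decay=8):
    """Blend a new idle-time measurement into a running prediction.

    Each update keeps (decay - 1)/decay of the old value, so older
    observations fade geometrically instead of being dropped at once.
    The decay factor of 8 is an arbitrary choice for this sketch.
    """
    return (previous * (decay - 1) + measured) / decay


# Usage: one short idle period nudges the prediction down without
# erasing the longer history.
prediction = 0.0
for idle_us in (800, 800, 50, 800):
    prediction = decayed_prediction(prediction, idle_us)
```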
Keir Fraser [Wed, 25 Nov 2009 14:05:28 +0000 (14:05 +0000)]
Replace tsc_native config option with tsc_mode config option
(NOTE: pvrdtscp mode not finished yet, but all other
modes have been tested so sooner seemed better than
later to submit this fairly major patch so we can get
more mileage on it before next release.)
New tsc_mode config option supersedes tsc_native and
offers a more intelligent default and an additional
option for intelligent apps running on PV domains
("pvrdtscp").
For PV domains, default mode will determine if the initial
host has a "safe"** TSC (meaning it is always synchronized
across all physical CPUs). If so, all domains will
execute all rdtsc instructions natively; if not,
all domains will emulate all rdtsc instructions while
providing the TSC hertz rate of the initial machine.
After being restored or live-migrated, all PV domains will
emulate all rdtsc instructions. Hence, this default mode
guarantees correctness while providing native performance
in most conditions.
For PV domains, tsc_mode==1 will always emulate rdtsc
and tsc_mode==2 will never emulate rdtsc. For tsc_mode==3,
rdtsc will never be emulated, but information is provided
through pvcpuid instructions and rdtscp instructions
so that an app can obtain "safe" pvclock-like TSC information
across save/restore and live migration. (Will be completed in
a follow-on patch.)
For HVM domains, the default mode and "always emulate"
mode do the same as tsc_native==0; the other two modes
do the same as tsc_native==1. (HVM domains since 3.4
have implemented a tsc_mode=default-like functionality,
but also can preserve native TSC across save/restore
and live-migration IFF the initial and target machines
have a common TSC cycle rate.)
** All newer AMD machines, and Nehalem and future Intel
machines have "Invariant TSC"; many newer Intel machines
have "Constant TSC" and do not support deep-C sleep states;
these and all single-processor machines are "safe".
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
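The PV-domain rules in the tsc_mode entry above can be summarised as a small decision function. This is a sketch of the described behaviour only, not hypervisor code: the function and constant names are hypothetical, and the value 0 for the default mode is an assumption (the text gives numbers only for modes 1 to 3).

```python
TSC_MODE_DEFAULT = 0          # assumed value; the text only says "default"
TSC_MODE_ALWAYS_EMULATE = 1
TSC_MODE_NEVER_EMULATE = 2
TSC_MODE_PVRDTSCP = 3


def pv_rdtsc_is_emulated(tsc_mode, host_tsc_safe, migrated):
    """Return True if rdtsc is emulated for a PV domain.

    host_tsc_safe: the initial host's TSC is synchronized across all CPUs.
    migrated: the domain has been restored or live-migrated.
    """
    if tsc_mode == TSC_MODE_ALWAYS_EMULATE:
        return True
    if tsc_mode in (TSC_MODE_NEVER_EMULATE, TSC_MODE_PVRDTSCP):
        return False
    if tsc_mode == TSC_MODE_DEFAULT:
        # Native only while still on a "safe" initial host.
        return migrated or not host_tsc_safe
    raise ValueError("unknown tsc_mode: %r" % (tsc_mode,))
```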
Keir Fraser [Wed, 25 Nov 2009 14:04:46 +0000 (14:04 +0000)]
hvmloader: Advertise ECC memory in SMBIOS tables.
Microsoft's Windows logo certified hardware requires ECC; since the
SVVP certification runs the same test on the guest, Xen domains will
currently fail it.
From: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 24 Nov 2009 14:43:07 +0000 (14:43 +0000)]
x86: Add a new physdev_op PHYSDEVOP_setup_gsi for GSI setup.
GSIs 0-15 are set up by the hypervisor, and GSIs >= 16 are set up by dom0
through this physdev_op, PHYSDEVOP_setup_gsi. This patch helps dom0
get rid of intrusive ioapic changes.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Keir Fraser [Tue, 24 Nov 2009 14:38:37 +0000 (14:38 +0000)]
tmem: fix freeable memory accounting error
Fix tmem accounting error that causes an "apparent"
memory leak, creating false negatives when testing
memory availability for launching a new domain.
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Keir Fraser [Tue, 24 Nov 2009 14:37:59 +0000 (14:37 +0000)]
tmem: Fix another race in tmem on domain destroy.
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Keir Fraser [Mon, 23 Nov 2009 15:19:38 +0000 (15:19 +0000)]
Revert 20457:1bbc132675a2
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 23 Nov 2009 08:06:54 +0000 (08:06 +0000)]
pygrub: add basic support for parsing grub2 style grub.cfg file
This represents a very simplistic approach to parsing these files. It
is basically sufficient to parse the files produced by Debian
Squeeze's version of update-grub. The actual grub.cfg syntax is much
more expressive but apparently not documented apart from a few
examples...
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Keir Fraser [Mon, 23 Nov 2009 08:06:19 +0000 (08:06 +0000)]
pygrub: track the title of an item as an independent field
separate to the other fields.
This makes the list of lines within a GrubImage 0-based rather than
1-based; adjust the user interface parts to suit.
This is in preparation for grub2 support where the syntax for the item
title does not fit the existing usage.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Keir Fraser [Mon, 23 Nov 2009 08:05:49 +0000 (08:05 +0000)]
pygrub: factor generic Grub functionality into GrubConf base classes
and inherit from these classes to implement Grub-legacy functionality.
Use a tuple of (parser-object,configuration-file) in pygrub to allow
for multiple parsers.
Makes way for grub2 support.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Keir Fraser [Mon, 23 Nov 2009 07:23:07 +0000 (07:23 +0000)]
xend: pci: show error msg properly if pciback/pci-stub are not loaded
Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
Keir Fraser [Mon, 23 Nov 2009 07:22:28 +0000 (07:22 +0000)]
minios: Fix fb/kbd initialization
When allocating kbdfront and fbfront structures, we should zero them
since we do not initialize all fields.
Signed-Off-By: Samuel Thibault <samuel.thibault@ens-lyon.org>
Keir Fraser [Mon, 23 Nov 2009 07:21:58 +0000 (07:21 +0000)]
minios: Fix xenbus_unwatch_path calls
In a lot of places in MiniOS frontends, xenbus_watch_path_token is
used instead of xenbus_watch_path to get more precise wake ups. To
free those, xenbus_unwatch_path_token has to be used instead of
xenbus_unwatch_path, else the unwatch operation will fail. This fixes
spurious watch events left by pv-grub.
Signed-Off-By: Samuel Thibault <samuel.thibault@ens-lyon.org>
Keir Fraser [Mon, 23 Nov 2009 07:17:32 +0000 (07:17 +0000)]
pygrub: expand tabs before displaying menus.
Otherwise the highlighting and line length trimming do not work as
expected and the display appears corrupted.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
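Why tabs upset the trimming in the entry above: a tab counts as one character when slicing but occupies several columns on screen. A minimal sketch of expanding first and trimming afterwards, using a hypothetical helper name rather than pygrub's own code:

```python
def format_menu_line(text, width, tabsize=8):
    """Expand tabs, then trim or pad the line to exactly `width` columns."""
    expanded = text.expandtabs(tabsize)
    return expanded[:width].ljust(width)
```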
Keir Fraser [Mon, 23 Nov 2009 07:17:10 +0000 (07:17 +0000)]
pygrub: if default entry is "saved" then use first entry.
pygrub doesn't support the "savedefault" command and will error out if
menu.lst uses the "default saved" directive. We might as well start on
the first entry in this case instead of failing.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
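The fallback described in the entry above amounts to treating "saved" as index 0. A minimal sketch, assuming a hypothetical helper that receives the argument of the "default" directive as a string (this is not pygrub's actual parsing code):

```python
def parse_default_entry(value):
    """Map the argument of menu.lst's `default` directive to an entry index.

    "saved" cannot be honoured without savedefault support, so fall back
    to the first entry instead of failing.
    """
    if value.strip() == "saved":
        return 0
    return int(value)
```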
Keir Fraser [Mon, 23 Nov 2009 07:16:23 +0000 (07:16 +0000)]
xsm: Change format strings from signed to unsigned
...to reflect the variables being passed in.
Signed-off-by: Paul Nuzzi <pjnuzzi@tycho.ncsc.mil>
Keir Fraser [Mon, 23 Nov 2009 07:14:33 +0000 (07:14 +0000)]
pcifront: fix multiple initialization bug
Now that we have pcifront_watches to dynamically initialize pcifront
we don't need a call to init_pcifront in pcilib and pcifront_scan
anymore; we should just wait for the frontend to connect to the
backend instead.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Mon, 23 Nov 2009 07:13:59 +0000 (07:13 +0000)]
libxc: Minor tools bzip2/lzma decompression fixes
The attached patch cleans up a few minor problems in the bzip2/lzma
decompression support, pointed out by Jiri in internal review. In
particular, it fixes a possible memory leak on realloc() error, it
fixes a shifting typo, and it changes the xc_dom_printf()'s to be a
bit clearer about which compression routine is in use.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Keir Fraser [Mon, 23 Nov 2009 07:12:06 +0000 (07:12 +0000)]
xend: Add support for XCP Windows PV drivers
This patch adds support for XCP Windows paravirtual drivers to run on
Xen. The drivers are currently provided in binary-only format from
Citrix. At a minimum, this patch is useful for performance comparisons
vs GPLPV drivers.
Live migration and save/resume are functional but set the guest clock
to the 1970's. The clock must be manually adjusted for the guest's ntp
to resume accurate timekeeping.
Before rebooting Windows at the end of driver installation, create the
registry key
HKLM\System\CurrentControlSet\Services\xenevtchn\Parameters. Add to it
a DWORD called SetFlags with a value of 0x10000000.
Signed-off-by: Keith Coleman <keith@scaltro.com>
Keir Fraser [Mon, 23 Nov 2009 07:10:56 +0000 (07:10 +0000)]
Remus: remove Py_RETURN_NONE for Python 2.3
Signed-off-by: KUWAMURA Shin'ya <kuwa@jp.fujitsu.com>
Keir Fraser [Mon, 23 Nov 2009 07:07:08 +0000 (07:07 +0000)]
Remus: fix a warning
This patch fixes the following warning:
xen/lowlevel/checkpoint/libcheckpoint.c: In function
`delete_suspend_timer':
xen/lowlevel/checkpoint/libcheckpoint.c:352: warning: assignment
makes integer from pointer without a cast
Signed-off-by: KUWAMURA Shin'ya <kuwa@jp.fujitsu.com>
Keir Fraser [Mon, 23 Nov 2009 07:06:39 +0000 (07:06 +0000)]
libxenlight: fix compilation error for ia64
xc_cpuid_apply_policy() and HVM_PARAM_VIRIDIAN are defined on x86
only.
Signed-off-by: KUWAMURA Shin'ya <kuwa@jp.fujitsu.com>
Keir Fraser [Mon, 23 Nov 2009 07:06:10 +0000 (07:06 +0000)]
[IA64] Remus: ia64 counterpart of 07f6d9047af4
This patch adds callbacks to xc_domain_save().
Signed-off-by: KUWAMURA Shin'ya <kuwa@jp.fujitsu.com>
Keir Fraser [Mon, 23 Nov 2009 07:05:34 +0000 (07:05 +0000)]
docs: descriptions of PSCSI_HBA and DSCSI_HBA
Add descriptions of PSCSI_HBA class and DSCSI_HBA class to XenAPI
document.
Signed-off-by: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
Keir Fraser [Mon, 23 Nov 2009 07:04:54 +0000 (07:04 +0000)]
libxenlight: fix memory leaks
In particular:
- all the temporary flexarrays allocated in the create
device functions must be freed;
- all the strings that don't need to be modified can be added as they
are to these temporary flexarrays instead of duplicating them;
- any data returned to the user shouldn't be added to the global
memory tracker so that the user can free it whenever he wishes.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Mon, 23 Nov 2009 07:03:01 +0000 (07:03 +0000)]
VT-d: Call pci_enable_acs() in pci_add_device_ext()
Signed-off-by: Allen Kay <allen.m.kay@intel.com>
Keir Fraser [Mon, 23 Nov 2009 07:01:51 +0000 (07:01 +0000)]
libxenlight: check for early failures of qemu-dm
This patch makes xl create check whether qemu-dm has started
correctly, and causes it to fail immediately with appropriate errors
if not. There are other bugfixes too.
More specifically:
* libxl_create_device_model forks twice rather than once so that the
process which calls libxl does not end up being the actual parent
of qemu. That avoids the need for the qemu-dm process to be reaped
at some indefinite time in the future.
* The first fork generates an intermediate process which is
responsible for writing the qemu-dm pid to xenstore and then merely
waits to collect and report on qemu-dm's exit status during
startup. New arguments to libxl_create_device_model allow the
preservation of its pid so that a later call can check whether the
startup is successful.
* The core of this functionality (the double fork, waitpid, signal
handling and so forth) is abstracted away into a new facility
libxl_spawn_... in libxl_exec.c.
Consequential changes:
* libxl_wait_for_device_model now takes a callback function parameter
which is called repeatedly in the loop iteration and allows the
caller to abort the wait.
* libxl_exec no longer calls fork; there is a new libxl_fork.
* There is a hook to override waitpid, which will be necessary for
some callers.
Remaining problems and other issues I noticed or we found:
* The error handling is rather inconsistent still and lacking in
places.
* destroy_device_model can kill random dom0 processes (!)
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
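The double-fork arrangement described in the entry above can be sketched on a POSIX system as follows. This is an illustration of the general pattern, not the libxl_spawn_... code: the function name is hypothetical, and a pipe stands in for the xenstore write that libxl uses to publish the pid.

```python
import os


def spawn_via_intermediate(argv):
    """Run argv as a grandchild; return (intermediate_pid, grandchild_pid)."""
    read_fd, write_fd = os.pipe()
    intermediate = os.fork()
    if intermediate == 0:
        # Intermediate process: start the real program and supervise it.
        os.close(read_fd)
        child = os.fork()
        if child == 0:
            os.close(write_fd)
            try:
                os.execvp(argv[0], argv)
            finally:
                os._exit(127)  # only reached if exec failed
        # Report the grandchild's pid (a pipe here, xenstore in libxl).
        os.write(write_fd, str(child).encode())
        os.close(write_fd)
        _, status = os.waitpid(child, 0)
        os._exit(0 if status == 0 else 1)
    # Caller: it is the parent of the intermediate process only, so it
    # never has to reap the long-running program itself.
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        grandchild = int(pipe.read())
    return intermediate, grandchild
```

The caller can later waitpid() on the intermediate pid to learn whether startup succeeded, which is the role the preserved pid plays in the description above.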
Keir Fraser [Mon, 23 Nov 2009 07:00:08 +0000 (07:00 +0000)]
libxenlight: correct broken osdeps.[ch] and make #includes consistent
osdeps.[hc] previously mistakenly declared and defined [v]asprintf.
These functions are available in the libc on most platforms. Also,
osdeps.h is used by xc.c but xc.c is not part of the library, so
osdeps.h is part of the public interface and should have a better
name.
So now, instead:
* osdeps.h is libxl_osdeps.h.
* _GNU_SOURCE is #defined in libxl_osdeps.h so that we get the system
[v]asprintf (and various other functions)
* libxl_osdeps.h is included first in every libxl*.c file (it needs
to be before any system headers so that _GNU_SOURCE takes effect).
* osdeps.[hc] only provide their own reimplementation of [v]asprintf
if NEED_OWN_ASPRINTF is defined. Currently it is not ever defined
but this is provided for any platform which needs it.
* While I was editing the #includes in each .c file, I put them all
into the same order: "libxl_osdeps.h", then system headers,
then local headers.
* xs.h is included in libxl.h. This is needed for "bool"; it has to
not be typedefed in libxl.h because otherwise we get a duplicate
definition when including xs.h.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Keir Fraser [Mon, 23 Nov 2009 06:59:06 +0000 (06:59 +0000)]
libxenlight: Clean up logging arrangements
* Introduce new variants of the logging functions which include
errno values (converted using strerror) in the messages passed to
the application's logging callback.
* Use the new errno-including logging functions everywhere where
appropriate. In general, xc_... functions return errno values or 0;
xs_... functions return 0 or -1 (or some such) setting errno.
* When libxl_xs_get_dompath fails, do not treat it as an allocation
error. It isn't: it usually means xenstored failed.
* Remove many spurious \n's from log messages. (The application's log
callback is expected to add a \n if it wants to do that, so libxl's
logging functions should be passed strings without \n.)
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
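The two logging conventions in the entry above (append the strerror text, and pass no trailing newline to the callback) can be sketched as follows. The helper name and message layout are assumptions for illustration, not libxl's actual macros.

```python
import os


def format_errno_message(message, errnoval):
    """Build a log line that includes the strerror text for errnoval.

    The result carries no trailing newline; the application's logging
    callback is expected to add one if it wants it.
    """
    return "%s: %s" % (message.rstrip("\n"), os.strerror(errnoval))
```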
Keir Fraser [Mon, 23 Nov 2009 06:58:19 +0000 (06:58 +0000)]
x86: enable directed EOI
This patch enables directed EOI on the latest processors. With this, the
broadcast of EOI is suppressed upon LAPIC EOI, so the VMM is
required to perform a directed EOI to the IOxAPIC generating the
interrupt by writing to its EOI register. (Please refer to SDM 3A 10.5.5.)
This is useful for ioapic_ack_old to avoid the spurious interrupt
storm, which is the reason why ioapic_ack_new is used.
Signed-Off-By: Zhai Edwin <edwin.zhai@intel.com>
Keir Fraser [Mon, 23 Nov 2009 06:56:01 +0000 (06:56 +0000)]
vt-d: enable PCI ACS P2P upstream forwarding
This patch enables P2P upstream forwarding in ACS capable PCIe
switches. The enabling is conditioned on iommu_enabled variable.
This code solves two potential problems in a virtualization environment
where a PCIe device is assigned to a guest domain using a HW iommu
such as VT-d:
1) Unintentional failure caused by a guest physical address programmed
into the device's DMA that happens to match the memory address range
of other downstream ports in the same PCIe switch. This causes the
PCI transaction to go to the matching downstream port instead of going to
the root complex to get translated by VT-d as it should be.
2) Malicious guest software intentionally attacks another downstream
PCIe device by programming a DMA address into the assigned device
that matches the memory address range of the downstream PCIe port.
Corresponding ACS filtering code is already in the upstream control panel
code, which does not allow PCI device passthrough to guests if the device
is behind a PCIe switch that does not have ACS capability, or that has ACS
capability but does not have it enabled.
Signed-off-by: Allen Kay <allen.m.kay@intel.com>
Keir Fraser [Mon, 23 Nov 2009 06:54:03 +0000 (06:54 +0000)]
libxenlight: implement support for pv guests
This patch makes pv guest work correctly with libxenlight. It also
implements support for vfb and vkbd, starting qemu in xenpv mode. Both
xenconsole and qemu are supported as console backends.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Mon, 23 Nov 2009 06:52:35 +0000 (06:52 +0000)]
xend: Remove tab indents
Signed-off-by: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
Keir Fraser [Mon, 23 Nov 2009 06:48:14 +0000 (06:48 +0000)]
tmem: fix double-free bug
Tmem double-frees a high-level data structure causing memory
corruption under certain circumstances.
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Keir Fraser [Mon, 23 Nov 2009 06:47:29 +0000 (06:47 +0000)]
rombios: don't busy-wait for keystrokes
Spinning waiting for the keyboard is a bit rude on a virtual
machine. Wait for an interrupt instead.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
Keir Fraser [Mon, 23 Nov 2009 06:46:58 +0000 (06:46 +0000)]
tmem: printk too chatty when tmem enabled
Two gdprintk's that are rarely encountered with tmem disabled
are frequent but meaningless when tmem is enabled. Printing
these tens-to-hundreds of times per second (in certain
circumstances even higher) slows down domain execution.
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Keir Fraser [Mon, 23 Nov 2009 06:45:03 +0000 (06:45 +0000)]
tmem: fix regression from c/s 19886 "Remove page-scrub lists and async scrubbing"
Fix incorrect page_list macro choice from page-scrub code cleanup.
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Keir Fraser [Mon, 23 Nov 2009 06:43:50 +0000 (06:43 +0000)]
x86 shadow: Relax assertion in VRAM tracking code
The original assertion is too strict, as it includes the A/D bits of
the PTE, which (by design) can change under our feet.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
Keir Fraser [Mon, 23 Nov 2009 06:42:12 +0000 (06:42 +0000)]
blktap2: fix libgcrypt detection
If we want to check the functionality of libgcrypt, we shouldn't test
a function only exported by openssl, but instead the one actually used
in the code.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Keir Fraser [Tue, 17 Nov 2009 13:07:16 +0000 (13:07 +0000)]
Revert 20437:64599a2d310d
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 17 Nov 2009 08:05:52 +0000 (08:05 +0000)]
blktap2: Remove uninitialised variable rc from tdremus_close().
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Sat, 14 Nov 2009 10:32:59 +0000 (10:32 +0000)]
tmem: fix domain shutdown problem/race
Tmem fails to put_domain so a dying domain never gets
properly shut down. Also, fix race condition when
domain is dying by not allowing any new ops to succeed.
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Keir Fraser [Sat, 14 Nov 2009 10:25:19 +0000 (10:25 +0000)]
xend: Remove extraneous logging from pyxc_physinfo().
Also fixes 32-bit build.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Sat, 14 Nov 2009 08:09:50 +0000 (08:09 +0000)]
xend: Balloon down memory to achieve enough DMA32 memory for PV guests
with PCI pass-through to successfully launch.
If the user hasn't used the dom0_mem= bootup parameter, the privileged
domain usurps all of the memory. During launch of PV guests with PCI
pass-through we ratchet down the memory for the privileged domain to
the required memory for the PV guest. However, for PV guests with PCI
pass-through we do not take into account that the PV guest is going to
swap its SWIOTLB memory for DMA32 memory - in fact, swap 64MB of
it. This patch balloons down the privileged domain so that there are
64MB of DMA32 memory available.
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Fri, 13 Nov 2009 22:13:59 +0000 (22:13 +0000)]
stubdom: Fix up pciutils.patch
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Fri, 13 Nov 2009 22:00:19 +0000 (22:00 +0000)]
xsm: Dynamic update to device ocontexts
Added the ability to add and delete ocontexts dynamically on a running
system. Two new commands have been added to the xsm hypercall, add
and delete ocontext. Twelve new library functions have been
implemented that use the hypercall commands to label and unlabel
pirqs, PCI devices, I/O ports and memory. The base policy has been
updated so dom0 has the ability to use the hypercall commands by
default. Items added to the list will not be present next time the
system reloads. They will need to be added to the static policy.
Signed-off-by: George Coker <gscoker@alpha.ncsc.mil>
Signed-off-by: Paul Nuzzi <pjnuzzi@tycho.ncsc.mil>
Keir Fraser [Fri, 13 Nov 2009 21:59:20 +0000 (21:59 +0000)]
xen: allow stubdom to call unmap_domain_pirq
There is one missing IS_PRIV/IS_PRIV_FOR change in xen to make
xc_physdev_unmap_pirq work with stubdoms.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Fri, 13 Nov 2009 21:58:30 +0000 (21:58 +0000)]
pcifront: implement dynamic connections and disconnections
This patch implements dynamic connections and disconnections in
pcifront.
This feature is required to properly support pci hotplug, because when
no pci devices are assigned to a guest, xend will remove the pci
backend altogether.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Fri, 13 Nov 2009 21:54:44 +0000 (21:54 +0000)]
xend: call xc_assign_device for all the devices to hotplug
This patch fixes a couple of issues with pci passthrough in xend,
previously reported by Cui Dexuan.
The main problem is that xc_assign_device is called only for the first
device hotplugged into the guest and not for the following ones.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Fri, 13 Nov 2009 21:09:33 +0000 (21:09 +0000)]
remus: Add missing python __init__.py file
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Fri, 13 Nov 2009 17:21:13 +0000 (17:21 +0000)]
remus: Add missing unistd.h include from libcheckpoint.c
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Fri, 13 Nov 2009 17:02:25 +0000 (17:02 +0000)]
remus: Fix makefiles for indentation
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Fri, 13 Nov 2009 15:46:58 +0000 (15:46 +0000)]
Merge
Keir Fraser [Fri, 13 Nov 2009 15:38:57 +0000 (15:38 +0000)]
vtd: Make vtd faults dmesg more readable
This simple patch makes the VTd faults dmesg more readable and
helpful for debugging.
Signed-Off-By: Zhai Edwin <edwin.zhai@intel.com>
Keir Fraser [Fri, 13 Nov 2009 15:34:46 +0000 (15:34 +0000)]
Remus: support for network buffering
This currently relies on the third-party IMQ patch (linuximq.net)
being present in dom0. The plan is to replace this with a direct hook
into netback eventually.
This patch includes a pared-down and patched copy of ebtables to
install IMQ on a VIF.
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
Keir Fraser [Fri, 13 Nov 2009 15:34:03 +0000 (15:34 +0000)]
Remus: add control script to activate remus on a VM
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
Keir Fraser [Fri, 13 Nov 2009 15:33:37 +0000 (15:33 +0000)]
Remus: add python control extensions
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
Keir Fraser [Fri, 13 Nov 2009 15:31:45 +0000 (15:31 +0000)]
x86: Change the interface physdev_map_pirq to support new dom0.
It also keeps compatibility with old dom0.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Keir Fraser [Fri, 13 Nov 2009 15:31:16 +0000 (15:31 +0000)]
libxenlight: implement pci passthrough
This patch implements pci passthrough (hotplug and coldplug) in
libxenlight, it also adds three new commands to xl: pci-attach,
pci-detach and pci-list.
Currently flr on a device is done by writing to
/sys/bus/pci/drivers/pciback/do_flr
pciback do_flr is present in both XCI and XCP 2.6.27 kernels.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Fri, 13 Nov 2009 15:30:24 +0000 (15:30 +0000)]
libxenlight: fix name to domid conversion
This patch makes sure that the domain name to domid conversion is
correct, cross referencing the information found on xenstore with the
list of running domains.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
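The cross-referencing described in the entry above can be sketched as a lookup that trusts a xenstore name only if its domid is in the list of running domains. The function name and the data shapes (a dict of names keyed by domid, a list of running domids) are assumptions for the example, not libxenlight's interfaces.

```python
def name_to_domid(name, xenstore_names, running_domids):
    """Return the domid of the running domain called `name`, or None.

    xenstore_names: mapping of domid -> name as found in xenstore, which
    may still hold stale entries for domains that no longer exist.
    running_domids: domids reported as currently running.
    """
    for domid in running_domids:
        if xenstore_names.get(domid) == name:
            return domid
    return None
```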
Keir Fraser [Thu, 12 Nov 2009 15:34:37 +0000 (15:34 +0000)]
x86: Disable spinlock checks temporarily while bringing a CPU online.
This is safe, as described in a code comment. Also fix up another
comment in start_secondary() while we're there.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 12 Nov 2009 13:15:40 +0000 (13:15 +0000)]
Don't assume vcpu_id's are contiguous in alloc_vcpu
When a CPU is hot-added, this assumption is broken because the hot-added
CPU may be brought online by dom0 in arbitrary order. This patch avoids
making this assumption while still linking vcpus in ascending order of
identifier.
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
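Linking vcpus in ascending id order without assuming contiguous ids, as the entry above describes, can be sketched like this. The class and function names are hypothetical and the code is a simplified model, not the hypervisor's alloc_vcpu().

```python
class Vcpu:
    def __init__(self, vcpu_id):
        self.vcpu_id = vcpu_id
        self.next_in_list = None


def link_vcpu(vcpus, new):
    """Insert `new` into the id-ordered list threaded through `vcpus`.

    vcpus is a table indexed by vcpu_id, holding None for ids that have
    not been allocated yet.
    """
    vcpus[new.vcpu_id] = new
    # Find the nearest allocated vcpu with a smaller id, skipping gaps.
    prev_id = new.vcpu_id - 1
    while prev_id >= 0 and vcpus[prev_id] is None:
        prev_id -= 1
    if prev_id >= 0:
        new.next_in_list = vcpus[prev_id].next_in_list
        vcpus[prev_id].next_in_list = new
        return
    # No predecessor: `new` becomes the head, so point it at the nearest
    # allocated vcpu with a larger id, if there is one.
    for later in vcpus[new.vcpu_id + 1:]:
        if later is not None:
            new.next_in_list = later
            break


def walk(head):
    """Collect vcpu ids by following next_in_list from `head`."""
    ids = []
    while head is not None:
        ids.append(head.vcpu_id)
        head = head.next_in_list
    return ids
```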
Keir Fraser [Thu, 12 Nov 2009 13:02:27 +0000 (13:02 +0000)]
Revert 20045:db1890f07661 "Revert alloc_idle_vcpu()..."
The old implementation of alloc_idle_vcpu() is unnecessary since
arch-specific code ensures that a single idle domain supports NR_CPUS
vcpus, despite the usual limit of MAX_VIRT_CPUS for ordinary domains.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 12 Nov 2009 11:59:18 +0000 (11:59 +0000)]
x86: Remove non-CONFIG_HOTPLUG_CPU code, and general cleanup.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 12 Nov 2009 11:43:21 +0000 (11:43 +0000)]
Support physical CPU hot-add in xen hypervisor
This patch adds CPU hot-add to the system.
a) It marks all CPUs as possible when booting, if CONFIG_HOTPLUG_CPU is
set. BTW, this will increase the per_cpu area.
b) When a CPU is added through a hypercall, the CPU will be marked as
present and offline, and the numa information is set up if numa is
supported. The CPU will then be brought online explicitly by dom0.
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
Keir Fraser [Thu, 12 Nov 2009 11:42:36 +0000 (11:42 +0000)]
Update pcpu_info hypercall interface
This patch changes the XENPF_get_cpuinfo interface to pass only one
pcpu's information per hypercall. Also, it replaces
xenpf_resource_hotplug with XENPF_cpu_online/offline.
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
Keir Fraser [Thu, 12 Nov 2009 11:42:02 +0000 (11:42 +0000)]
A few trivial cleanups
Alphabetize object files and guest config options for better
readability. Also remove svm interrupt prototypes which do not
exist.
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Keir Fraser [Thu, 12 Nov 2009 11:40:44 +0000 (11:40 +0000)]
xend/xm: Add PSCSI_HBA class and DSCSI_HBA class to XenAPI
XenAPI (not xapi) has supported only LUN assignment mode for pvSCSI.
With these patches, HOST assignment mode is now supported as well.
To support HOST assignment mode, these patches add the PSCSI_HBA class
and the DSCSI_HBA class to XenAPI.
Signed-off-by: Masaki Kanno <kanno.masaki@jp.fujitsu.com>
Keir Fraser [Thu, 12 Nov 2009 11:39:51 +0000 (11:39 +0000)]
PoD: Handle operations properly when domain is dying
No populate-on-demand activities should happen when a domain is dying.
Especially, it is a bug for memory to be added to the PoD cache when
d->is_dying is non-zero, since if this happens after the cache has
been emptied, these pages will never be freed. This may cause "zombie
domains" to linger.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Keir Fraser [Wed, 11 Nov 2009 13:11:44 +0000 (13:11 +0000)]
blktap2: Remove gnu89-inline option from CFLAGS
Not supported by older versions of gcc.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 10 Nov 2009 13:04:45 +0000 (13:04 +0000)]
Mark CPU present when it is detected
Currently a CPU is marked as present only after it has been kicked off
successfully, i.e. before the CPU is brought up, it is not
present. This patch marks a CPU as present when it is detected
(either through the MPS table or ACPI). If it can't be brought up
successfully, it will be marked as non-present again. This change is
mainly for CPU hot-plug. As discussed, we take two steps for physical
CPU hot-add: a CPU is first marked as present, and is later brought
online.
Also, in smp_boot_cpus(), Xen need only scan the present CPUs; there is
no need to loop from 0 to NR_CPUS. With this change, bios_cpu_apicid
is not needed anymore.
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
Keir Fraser [Tue, 10 Nov 2009 13:03:42 +0000 (13:03 +0000)]
Hypercall to expose physical CPU information.
It also makes some changes to the current CPU online/offline logic:
1) CPU online/offline now triggers a vIRQ to dom0 to notify it of the
status change.
2) It adds an interface to the platform operations to online/offline a
physical CPU. Currently the CPU online/offline interface is in sysctl,
which can't be triggered from the kernel. With this change, it is
possible to trigger CPU online/offline in dom0 through a sysfs interface.
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
Keir Fraser [Tue, 10 Nov 2009 13:01:09 +0000 (13:01 +0000)]
tools: Make build again on netbsd
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 9 Nov 2009 22:41:23 +0000 (22:41 +0000)]
libxl: Call to open() must specify mode with O_CREAT.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
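For illustration, a minimal sketch of the rule this fix enforces. The
helper name open_logfile() and the 0644 mode are assumptions made for
the example, not taken from libxl. When O_CREAT is set, open() reads a
third mode argument; if it is omitted, the permissions of a newly
created file are indeterminate.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper (not from libxl): create or append to a log file. */
int open_logfile(const char *path)
{
    assert(path != NULL);
    /*
     * O_CREAT makes the third argument mandatory. 0644 is an assumed
     * mode for this sketch; the process umask may clear some of its bits.
     */
    return open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
}
```

The function returns a file descriptor, or -1 with errno set on failure.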
Keir Fraser [Mon, 9 Nov 2009 22:30:21 +0000 (22:30 +0000)]
unlzma: Remove 'inline' decl from non-static function.
Breaks the build with some versions of gcc.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 9 Nov 2009 20:43:40 +0000 (20:43 +0000)]
x86: Fix clip_to_limit().
There are issues in updating the e820 map in the middle of a loop that
iterates over it. For example, after memmove(&e820.map[i],
&e820.map[i+1], ...), the original e820.map[i+1] becomes the current
e820.map[i] but the next loop index is i+1, so the original
e820.map[i+1] will be skipped.
Fix and clarify the code by making a double loop.
Original bug discovery and fix by Xiao Guangrong <ericxiao.gr@gmail.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
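The skipped-entry problem and the double-loop remedy can be sketched
on a simplified array. This is not the actual clip_to_limit() code:
struct region and drop_above_limit() are hypothetical stand-ins, and
the removal condition is chosen only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical stand-in for an e820 map entry. */
struct region {
    unsigned long start;
    unsigned long size;
};

/* Remove every region starting at or above limit; return the new count. */
size_t drop_above_limit(struct region *map, size_t nr, unsigned long limit)
{
    assert(map != NULL || nr == 0);

    for (;;) {
        size_t i;

        /* Inner loop: find the first entry that must be removed. */
        for (i = 0; i < nr; i++)
            if (map[i].start >= limit)
                break;

        if (i == nr)
            return nr;      /* nothing left to remove */

        /*
         * Shift the tail down over slot i, then restart the scan from
         * the top so the entry that moved into slot i is examined too.
         */
        memmove(&map[i], &map[i + 1], (nr - i - 1) * sizeof(map[0]));
        nr--;
    }
}
```

A single loop that removes map[i] and then advances to i+1 would miss
the second of two adjacent entries at or above the limit; restarting
the inner scan after each removal avoids that.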
Keir Fraser [Mon, 9 Nov 2009 20:06:48 +0000 (20:06 +0000)]
cmdline_parse_early: fix parse 'edd=' option
If 'edd=' is default, it should decrease "opt_edd", not "opt_edid".
Signed-off-by: Xiao Guangrong <ericxiao.gr@gmail.com>
Keir Fraser [Mon, 9 Nov 2009 20:05:43 +0000 (20:05 +0000)]
e820: fix e820_change_range_type()
In the case below, e820_change_range_type() returns success even
though it has actually failed: [s, e] is in the middle of [rs, re] and
e820->nr_map+1 >= ARRAY_SIZE(e820->map). This patch fixes it.
Signed-off-by: Xiao Guangrong <ericxiao.gr@gmail.com>
Keir Fraser [Mon, 9 Nov 2009 19:54:28 +0000 (19:54 +0000)]
libxenlight: initial libxenlight implementation under tools/libxl
Signed-off-by: Vincent Hanquez <Vincent.Hanquez@eu.citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Mon, 9 Nov 2009 19:45:06 +0000 (19:45 +0000)]
blktap2: add remus driver
Blktap2 port of the remus disk driver. Backwards compatible with the
blktap1 implementation.
Signed-off-by: Ryan O'Connor <rjo@cs.ubc.ca>
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
Keir Fraser [Mon, 9 Nov 2009 19:41:16 +0000 (19:41 +0000)]
Remus: Fixup for tap:tapdisk syntax in remus uname
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
Keir Fraser [Mon, 9 Nov 2009 19:40:48 +0000 (19:40 +0000)]
blktap2: only open driver stack once
Currently blktap2 opens a driver stack, closes it, and re-opens
it. This causes problems with our remus driver: the primary may
connect to the backup in between the first and second open.
This is a temporary fix.
Signed-off-by: Ryan O'Connor <rjo@cs.ubc.ca>
Keir Fraser [Mon, 9 Nov 2009 19:40:14 +0000 (19:40 +0000)]
blktap2: configurable driver chains
Blktap2 allows block device drivers to be layered to create more
advanced virtual block devices. However, composing a layered driver is
not exposed to the user. This patch fixes this, and allows the user to
explicitly specify a driver chain when starting a tapdisk process,
using the pipe character ('|') to separate layers in a
blktap2 configuration string.
For example, the command:
~$ tapdisk2 -n "log:|aio:/path/to/file.img"
will create a blktap2 device where read and write requests are passed
to the 'log' driver, then forwarded to the 'aio' driver.
Signed-off-by: Ryan O'Connor <rjo@cs.ubc.ca>
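As a rough illustration of the syntax only, a chain string could be
split on '|' as below. This is not the tapdisk2 parser: split_chain()
and its in-place splitting strategy are assumptions made for the
sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not the tapdisk2 parser): split a chain such as
 * "log:|aio:/path/to/file.img" into its layers, in place. Each '|' is
 * overwritten with a NUL and layers[] points into spec. Returns the
 * number of layers stored; layers beyond max are silently dropped.
 */
size_t split_chain(char *spec, char *layers[], size_t max)
{
    size_t n = 0;

    assert(spec != NULL && layers != NULL);

    while (n < max) {
        char *bar = strchr(spec, '|');

        layers[n++] = spec;
        if (bar == NULL)
            break;          /* last layer in the chain */
        *bar = '\0';
        spec = bar + 1;
    }
    return n;
}
```

Applied to the string from the command above, this yields two layers:
"log:" first, then "aio:/path/to/file.img".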
Keir Fraser [Mon, 9 Nov 2009 19:19:27 +0000 (19:19 +0000)]
Remus: Make checkpoint buffering HVM-aware
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
Keir Fraser [Mon, 9 Nov 2009 19:17:22 +0000 (19:17 +0000)]
Remus: Do bitmap scan word-by-word before bit-by-bit.
For sparse bitmaps and large domains this saves a lot of time.
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
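The idea can be sketched as follows. This is not the Remus code:
count_set_bits() is a hypothetical function, and counting stands in
for whatever per-page work the real scan performs. An all-zero word is
rejected with one comparison, so a sparse bitmap costs roughly one
test per word instead of one per bit.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical sketch: count set bits among the first nbits of bitmap. */
size_t count_set_bits(const unsigned long *bitmap, size_t nbits)
{
    size_t nwords = (nbits + BITS_PER_WORD - 1) / BITS_PER_WORD;
    size_t count = 0;
    size_t w, b;

    assert(bitmap != NULL || nbits == 0);

    for (w = 0; w < nwords; w++) {
        /* Word-level check: skip a whole word with no bits set. */
        if (bitmap[w] == 0)
            continue;

        /* Bit-level scan only for words known to be non-zero. */
        for (b = 0; b < BITS_PER_WORD; b++) {
            if (w * BITS_PER_WORD + b >= nbits)
                break;      /* past the end of the bitmap */
            if (bitmap[w] & (1UL << b))
                count++;
        }
    }
    return count;
}
```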
Keir Fraser [Mon, 9 Nov 2009 19:16:48 +0000 (19:16 +0000)]
Remus: Do not bother with to_skip/to_fix bitmaps after the first final round.
Signed-off-by: Geoffrey Lefebvre <geoffrey@cs.ubc.ca>
Keir Fraser [Mon, 9 Nov 2009 19:16:19 +0000 (19:16 +0000)]
Remus: Buffer checkpoint data locally until domain has resumed execution.
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
Keir Fraser [Mon, 9 Nov 2009 19:15:34 +0000 (19:15 +0000)]
Remus: Initiate failover if a packet is not received every 500ms.
This breaks checkpoints at lower frequencies, and should be made
optional.
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
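A timeout of this kind can be sketched with poll(). This is not the
Remus implementation: wait_for_data() is a hypothetical helper, and
taking the timeout as a parameter (rather than a fixed 500 ms) is an
assumption that reflects the note above that it should be made
optional.

```c
#include <assert.h>
#include <poll.h>

/*
 * Hypothetical helper: wait up to timeout_ms for data on fd.
 * Returns 1 if the descriptor has an event pending, 0 on timeout
 * (the point at which a caller would initiate failover), or -1 if
 * poll() fails, including interruption by a signal.
 */
int wait_for_data(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    assert(timeout_ms >= 0);

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;
    return rc > 0 ? 1 : 0;
}
```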
Keir Fraser [Mon, 9 Nov 2009 19:14:03 +0000 (19:14 +0000)]
Remus: Make xc_domain_restore loop until the fd is closed.
The tail containing the final PFN table, VCPU contexts and
shared_info_page is buffered, then the read loop is restarted.
After the first pass, incoming pages are buffered until the next tail
is read, completing a new consistent checkpoint. At this point, the
memory changes are applied and the loop begins again. When the fd read
fails, the tail buffer is processed.
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
Keir Fraser [Mon, 9 Nov 2009 19:06:25 +0000 (19:06 +0000)]
Remus: Add callbacks for suspend, postcopy and preresume in xc_domain_save.
This makes it possible to perform repeated checkpoints.
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
Keir Fraser [Mon, 9 Nov 2009 18:54:27 +0000 (18:54 +0000)]
x86, hvm: Make host TscInvariant CPUID flag visible to guest by default.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 9 Nov 2009 08:19:55 +0000 (08:19 +0000)]
x86_32: Respect e820 map when populating Xen heap.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 9 Nov 2009 08:03:30 +0000 (08:03 +0000)]
x86, cpuid: mask TSC invariant bit for PV and HVM domains if migration
is not disabled and TSC is not emulated
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>